Abstract. In counting experiments, one can set an upper limit on the rate of a
Poisson process based on a count of the number of events observed due to the process.
In some experiments, one makes several counts of the number of events, using different
instruments, different event detection algorithms, or observations over multiple time
intervals. We demonstrate how to generalize the classical frequentist upper limit
calculation to the case where multiple counts of events are made over one or more time
intervals using several (not necessarily independent) procedures. We show how different
choices of the rank ordering of possible outcomes in the space of counts correspond
to applying different levels of significance to the various measurements. We propose
an ordering that is matched to the sensitivity of the different measurement procedures
and show that in typical cases it gives stronger upper limits than other choices. As
an example, we show how this method can be applied to searches for gravitational-wave
bursts, where multiple burst-detection algorithms analyse the same data set, and
demonstrate how a single combined upper limit can be set on the gravitational-wave
burst rate.
One of the most familiar applications of classical confidence intervals is to the counting
experiment, in which one attempts to measure or place a limit on the rate of a physical 
Poisson process by counting the number of occurrences of the process observed during 
some period of time. For example, for a single measurement (a single count of events) 
with low background and an expected physical rate comparable to or lower than the 
background, one typically sets an upper limit; i.e., a one-sided confidence interval. 
Given a count $n$, the upper limit is that value of the physical rate such that the a priori
probability of measuring more than $n$ events in the experiment equals some chosen
confidence level.

Various issues may complicate the procedure for setting the upper limit. For 
example, if the background is large, there is a well-known problem that the upper 
confidence limit may be the empty set when the observed number of events is much lower 
than that expected from the background. Another more subtle issue is that the decision
to report an upper limit versus a two-sided confidence interval can, if based on the
data, cause undercoverage, rendering the procedure invalid. Techniques for addressing
these issues have been presented in the literature, for example, by the Feldman-Cousins 
technique [1] and the loudest event technique [2]. These can also be addressed by
Bayesian methods; see for example [3], [6], [7], [8].

In this paper we are concerned with a different complication: when more than one 
count is made of the number of events. One example of where this situation arises is 
searches for gravitational-wave (GW) bursts with LIGO and similar detectors [9, 10, 11].
In this scenario the GW signals are expected to have amplitudes near the noise floor of 
the detectors, and the rate of detectable events is expected to be of order the inverse 
of the observation time or less. To improve chances of detection, multiple algorithms 
are used to analyse the data [12, 13, 14], each producing its own list of candidate GW
bursts. The event lists produced by these algorithms, however, are not completely 
independent. They will generally show some correlation between which foreground 
events they detect, and may also show some correlation between the background noise 
fluctuations they detect. Furthermore, the data set itself typically is not of uniform 
sensitivity. For example, the longest data-collection run to date for the LIGO-GEO-Virgo
network lasted more than two years [15]. Over this time the sensitivity of each
of the instruments changed, and at any given time during the run, anywhere between 
1 and 5 detectors may have been operating. The challenge to the data analyst in such 
an experiment is this: given multiple counts of events collected from processing several 
data sets of different sensitivities and with different algorithms, how does one set a single 
limit on the physical event rate? 

There are many options. The simplest is to take the union of all of the event 
lists and observation time, effectively converting the multiple observations into a single 
observation, and computing the upper limit using a standard technique. This approach
ignores differences in the quality of the data from the different epochs, and in the 
algorithms themselves. Alternatives include discarding results from select data sets or 
algorithms (presumably the less sensitive ones), again with the aim of reducing the 
observations to effectively a single count. These approaches invariably involve loss of 
information from the experiment. Intuitively, one expects to be able to set stronger 
limits if one uses all of the information from the experiment rather than only a subset 
of the information. 

In this article we propose a general formalism for setting classical upper limits 
on experiments involving multiple pipelines, where a pipeline denotes the analysis of 
a single data set by a single algorithm. We characterize the observational results and 
the sensitivities of the experiment in terms of logical combinations of pipelines. We 
show that various choices such as taking the union of data sets correspond to particular 
choices of weighting of measurements. We propose a specific weighting choice based
on the efficiencies (sensitivities) of the logical combinations, and show that it gives 
stronger upper limits than other choices in typical cases. Furthermore, the efficiency 
weighting choice makes use of all of the experiment results, naturally handles correlated
measurements, and tends to be robust against occasional background contamination of 
counts. 

This paper is organized as follows. In Section 2 we review how one sets a classical
upper limit on the rate of a Poisson-distributed process in a counting experiment. In
Section 3 we generalize the single-count procedure to the case of multiple counts. We
discuss various choices of the weighting to obtain upper limits, including our sensitivity-
based proposal. We demonstrate each procedure for the case of a counting experiment
using two pipelines, with and without background. In Section 4 we demonstrate how
the same procedure naturally handles multiple data sets. Section 5 contains a few brief
remarks on the applicability of the method.

2. Single-Pipeline Case 

We briefly review how one sets a classical upper limit (a one-sided confidence interval) 
on the rate of a Poisson-distributed process via a counting experiment. 

Consider an experiment that measures the number of events of a specific random
process that occur in a time $T$. We assume that the foreground events occur
independently of one another, with a mean rate $\mu$ that is unknown a priori. We further
assume that the experiment has a probability $\epsilon$ of successfully detecting (counting) any
given event. Finally, we assume that the mean number of background events (due to
"noise" or effects other than the physical effect of interest) in time $T$ is $b$. Then the
actual total number of events (foreground plus background) that will be counted in a
given time $T$ is Poisson distributed, as is easily demonstrated.

Let us divide the observation time $T$ into $M$ equal sub-intervals of length $T/M$.
In the limit of large $M$, the probability of one event being detected in any given
sub-interval is $(\epsilon\mu T + b)/M \ll 1$, and the probability of more than one event in the same
interval is negligible. The probability of detecting a total of $N$ events over the full time
$T$ is derived from binomial statistics as the probability of $N$ "successes" in $M$ "trials".
Defining $\lambda = \mu T$ as the expected mean number of foreground events occurring, we have

$$ P(N|\epsilon\lambda + b) = \lim_{M\to\infty} \binom{M}{N} \left(\frac{\epsilon\lambda + b}{M}\right)^{N} \left(1 - \frac{\epsilon\lambda + b}{M}\right)^{M-N} = \frac{(\epsilon\lambda + b)^{N}}{N!}\, e^{-(\epsilon\lambda + b)} . \qquad (1) $$

This is the familiar Poisson distribution for a process with mean number of detected
events $\epsilon\lambda + b$.

Given an actual measured number $n$, the Poisson distribution (1) can be used to set
an upper limit on the value of $\lambda$, or equivalently on $\mu$. Heuristically, values of $\lambda$ much
larger than $(n - b)/\epsilon$ are unlikely to produce only $n$ detected events. More formally, we
select a confidence level $\alpha \in (0, 1)$. The frequentist upper limit $\lambda_\alpha$ at confidence level $\alpha$
given $n$ measured events is that value of $\lambda$ at which there is an a priori probability $\alpha$ of
measuring more than $n$ events. Implicitly, $\lambda_\alpha$ is given by

$$ \alpha = \sum_{N=n+1}^{\infty} P(N|\epsilon\lambda_\alpha + b) = 1 - \sum_{N=0}^{n} P(N|\epsilon\lambda_\alpha + b) . \qquad (2) $$

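We define the cumulative probability $C(n|\epsilon\lambda + b)$ as the a priori probability of detecting
$n$ or fewer events:

$$ C(n|\epsilon\lambda + b) = \sum_{N=0}^{n} P(N|\epsilon\lambda + b) . \qquad (3) $$

We can write the upper limit formula for $\lambda_\alpha$ as

$$ C(n|\epsilon\lambda_\alpha + b) = 1 - \alpha . \qquad (4) $$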

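For example, the 90% confidence level ($\alpha = 0.9$) upper limit for zero observed events
($n = 0$) and zero background ($b = 0$) is

$$ 0.1 = C(0|\epsilon\lambda_{90\%}) = e^{-\epsilon\lambda_{90\%}} , \qquad \lambda_{90\%} = \frac{2.30}{\epsilon} . \qquad (5) $$

For $n = 1$ observed event the upper limit is higher (weaker):

$$ 0.1 = (1 + \epsilon\lambda_{90\%})\, e^{-\epsilon\lambda_{90\%}} , \qquad \lambda_{90\%} = \frac{3.89}{\epsilon} . \qquad (6) $$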
To be rigorous, one must prove that the upper limit formula (4) has a coverage of
at least $\alpha$. The coverage is defined as the fraction of measurements in an ensemble of
identical experiments for which the derived upper limit is greater than or equal to the
true rate $\lambda_{\rm true}$. To be a valid upper limit with confidence level $\alpha$, one must show that
$\lambda_\alpha \ge \lambda_{\rm true}$ in a fraction $\ge \alpha$ of experiments for any possible value of $\lambda_{\rm true}$.

It is straightforward to prove that the upper limit formula (4) has the coverage $\alpha$.
First, we note two properties of $C(n|\epsilon\lambda + b)$:

$$ C(n|\epsilon\lambda + b) > C(m|\epsilon\lambda + b) \quad \mbox{for } n > m ; \qquad (7) $$

$$ \frac{dC(n|\epsilon\lambda + b)}{d\lambda} < 0 . \qquad (8) $$

Let us suppose that the true value of the rate is $\lambda_{\rm true}$. Let $m$ be the largest integer
such that $C(m|\epsilon\lambda_{\rm true} + b) < 1 - \alpha$. By definition of $m$, in a fraction $> \alpha$ of
experiments the measured number of events $n$ will be larger than $m$. For these cases
$C(n|\epsilon\lambda_{\rm true} + b) \ge 1 - \alpha$. Applying the upper limit formula (4) and noting (8), we see
that in these cases the derived upper limit $\lambda_\alpha$ will be greater than or equal to $\lambda_{\rm true}$.
The coverage is thus established.
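
The coverage argument can also be illustrated numerically. The sketch below is a hedged illustration in Python, with a hypothetical function name `coverage` of our own choosing. It uses the step in the proof that the derived limit covers the true rate exactly when the observed count exceeds the largest integer $m$ whose cumulative probability is below $1 - \alpha$, so the exact coverage is one minus the cumulative probability at $m$. It assumes a moderate mean, so that the exponential does not underflow.

```python
import math

def coverage(lam_true, eff=1.0, bkg=0.0, alpha=0.9):
    """Exact coverage of the classical upper limit for a given true rate."""
    mean = eff * lam_true + bkg
    if mean > 700.0:
        raise ValueError("mean too large for this simple sketch")
    term = math.exp(-mean)          # P(0 | mean)
    cdf = term                      # C(0 | mean)
    n = 0
    uncovered = 0.0                 # C(m | mean); zero if no such m exists
    while cdf < 1.0 - alpha:
        uncovered = cdf             # counts <= n give a limit below lam_true
        n += 1
        term *= mean / n
        cdf += term
    return 1.0 - uncovered

for lam in (0.5, 2.3, 3.0, 5.0, 10.0):
    print(lam, coverage(lam))       # each value should be at least 0.9
```

The coverage is never below $\alpha$, but it is generally larger than $\alpha$ because the count is discrete.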

We should note that one has the freedom to ignore the experimental background
when computing the upper limit; i.e., one may use the approximation $b = 0$. Since the
background will increase $n$ above the value due to the physics of interest, the upper
limit derived using $b = 0$ remains valid (provides minimum coverage), though it will be
higher than if we had accounted for the background. We will use this approximation in
some of our worked examples.

‡ From (3), $dC(n|\epsilon\lambda + b)/d\lambda = -\epsilon(\epsilon\lambda + b)^{n} e^{-(\epsilon\lambda + b)}/n! < 0$ for $\lambda > 0$, $b > 0$.
We also note the well-known phenomenon that the classical one-sided confidence
interval procedure can produce an empty upper limit when the number of observed
events is much lower than the background. For example, for $n = 0$ observed events and
$b = 3$ the 90% upper limit is the solution of

$$ 0.1 = C(0|\epsilon\lambda_{90\%} + 3) = e^{-(\epsilon\lambda_{90\%} + 3)} . \qquad (9) $$

This has no solution with $\lambda_{90\%} \ge 0$, since $e^{-3} \approx 0.05 < 0.1$. Methods for handling this
issue have been proposed, for example, by Feldman and Cousins [1]. In this paper we consider
only one-sided confidence intervals, and therefore we will restrict ourselves to the case
where $b \lesssim 1$.

3. Multiple-Pipeline Case 

3.1. Formulation 

The simplest example of a multiple-pipeline experiment is one in which two different 
methods or "pipelines" are used to count events (by processing the same data, watching 
the same sky, etc.) over the same epoch $T$. (We'll consider the case of disjoint data
sets in Section 4.) Denote the pipelines by $A$ and $B$. Any given event may be detected
by pipeline $A$ only, by pipeline $B$ only, by both $A$ and $B$, or by neither pipeline. We
characterize the sensitivity of the experiment by the three numbers $\epsilon_A$, $\epsilon_B$, and $\epsilon_{AB}$:

$\epsilon_A$: The probability that any given foreground event will be detected by pipeline $A$ but
not detected by pipeline $B$;

$\epsilon_B$: The probability that any given foreground event will be detected by pipeline $B$ but
not detected by pipeline $A$;

$\epsilon_{AB}$: The probability that any given foreground event will be detected by both pipeline
$A$ and pipeline $B$.

We denote the expected background by the three numbers $b_A$, $b_B$, and $b_{AB}$:

$b_A$: The expected number of background events detected by pipeline $A$ but not detected
by pipeline $B$;

$b_B$: The expected number of background events detected by pipeline $B$ but not detected
by pipeline $A$; and

$b_{AB}$: The expected number of background events detected by both pipeline $A$ and
pipeline $B$.

Finally, the outcome of the counting experiment is the set of three numbers $n_A$, $n_B$, and
$n_{AB}$:

$n_A$: The number of events detected by pipeline $A$ but not detected by pipeline $B$;
$n_B$: The number of events detected by pipeline $B$ but not detected by pipeline $A$; and
$n_{AB}$: The number of events detected by both pipeline $A$ and pipeline $B$.
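To interpret $(n_A, n_B, n_{AB})$ in terms of an upper limit on $\lambda$, we first need to compute
the joint probability $P(n_A, n_B, n_{AB}|\lambda, \epsilon_A, \epsilon_B, \epsilon_{AB}, b_A, b_B, b_{AB})$. This is straightforward;
repeating the logic of the single-pipeline case, it is easy to see that

$$ P(N_A, N_B, N_{AB}|\lambda, \epsilon_A, \epsilon_B, \epsilon_{AB}, b_A, b_B, b_{AB}) $$
$$ = \lim_{M\to\infty} \binom{M}{N_A} \binom{M - N_A}{N_B} \binom{M - N_A - N_B}{N_{AB}} \left(\frac{\epsilon_A\lambda + b_A}{M}\right)^{N_A} \left(\frac{\epsilon_B\lambda + b_B}{M}\right)^{N_B} \left(\frac{\epsilon_{AB}\lambda + b_{AB}}{M}\right)^{N_{AB}} \left(1 - \frac{\epsilon_{\rm TOT}\lambda + b_{\rm TOT}}{M}\right)^{M - N_{\rm TOT}} $$
$$ = P(N_A|\epsilon_A\lambda + b_A)\, P(N_B|\epsilon_B\lambda + b_B)\, P(N_{AB}|\epsilon_{AB}\lambda + b_{AB}) . \qquad (10) $$

Here we have defined the total number of events detected,

$$ N_{\rm TOT} = N_A + N_B + N_{AB} , \qquad (11) $$

the total number of events expected from background,

$$ b_{\rm TOT} = b_A + b_B + b_{AB} , \qquad (12) $$

and the probability of a given foreground event being detected by any combination of
pipelines,

$$ \epsilon_{\rm TOT} = \epsilon_A + \epsilon_B + \epsilon_{AB} . \qquad (13) $$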

We see that by choosing to characterize the outcome of the experiment by the number
of events detected by logical combinations of pipelines, the joint probability factorizes
to the product of single-pipeline probabilities (1). The measurements of $N_A$, $N_B$, and
$N_{AB}$ can therefore be regarded as statistically independent experiments. This is a key
simplification that makes deriving a combined upper limit straightforward.

In the general case of $p$ pipelines, there are $q = 2^{p} - 1$ distinct combinations by
which an event may be detected. Using the vector notation $\vec{N}$, $\vec{\epsilon}$, and $\vec{b}$, where the
vector index $i \in [1, \ldots, q]$ labels the distinct combinations, we have

$$ P(\vec{N}|\lambda\vec{\epsilon} + \vec{b}) = \prod_{i=1}^{q} P(N_i|\lambda\epsilon_i + b_i) . \qquad (14) $$
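
The factorization can be checked numerically by comparing the finite-$M$ multinomial expression with the product of Poisson probabilities. The Python sketch below is an illustration only; the function names and the example efficiencies and backgrounds are hypothetical choices of ours, not values from this paper.

```python
import math

def poisson_pmf(N, mean):
    """Poisson probability of exactly N counts for the given mean."""
    return mean**N * math.exp(-mean) / math.factorial(N)

def joint_prob(counts, lam, eff, bkg):
    """Factorized joint probability: one Poisson factor per combination."""
    p = 1.0
    for N_i, e_i, b_i in zip(counts, eff, bkg):
        p *= poisson_pmf(N_i, lam * e_i + b_i)
    return p

def finite_m_prob(counts, lam, eff, bkg, M):
    """Multinomial probability with M sub-intervals, before taking the limit."""
    means = [lam * e + b for e, b in zip(eff, bkg)]
    p = 1.0
    remaining = M
    for N_i, x_i in zip(counts, means):
        # choose which of the remaining sub-intervals hold these N_i events
        p *= math.comb(remaining, N_i) * (x_i / M) ** N_i
        remaining -= N_i
    p *= (1.0 - sum(means) / M) ** remaining   # sub-intervals with no event
    return p

counts = (1, 0, 2)                  # illustrative (N_A, N_B, N_AB)
eff = (0.2, 0.1, 0.5)               # illustrative efficiencies
bkg = (0.1, 0.0, 0.05)              # illustrative backgrounds
exact = joint_prob(counts, 2.0, eff, bkg)
for M in (10**2, 10**4, 10**6):
    print(M, finite_m_prob(counts, 2.0, eff, bkg, M), exact)
```

As $M$ grows the multinomial value approaches the Poisson product, which is the content of the limit above.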

3.2. Defining an Upper Limit 

To set an upper limit we need first to define a cumulative probability distribution
$C(\vec{n}|\lambda\vec{\epsilon} + \vec{b})$ corresponding to (14), analogous to (3). Since the space of observation
$\{\vec{N}\}$ is multi-dimensional, we have a great deal of freedom in how we choose to sum
over $\{\vec{N}\}$ to define the cumulative distribution. Put another way, we must choose a rank
ordering of $\{\vec{N}\}$. (For an unbiased limit, this must be done before the measurement of
$\vec{n}$.)

To construct a confidence belt, we choose a one-parameter family of surfaces $S(\zeta)$
that foliates the observation space $\{\vec{N}\}$. This family is chosen so that for every value of
the parameter $\zeta$, the surface $S(\zeta)$ divides the space $\{\vec{N}\}$ into two regions: an acceptance
region of low number of events (including the origin, and the surface $S(\zeta)$ itself), and
a rejection region of high number of events. Our choice of the family $S(\zeta)$ is arbitrary,
except that the outward normal to each surface must have non-negative components
everywhere; this is required to prove coverage, as shown below. As we shall see, our
freedom in the choice of the $S(\zeta)$ corresponds to how the various pipelines are "weighted"
in contributing to the upper limit.

Because of the foliation, every point in the observation space lies on exactly one
such surface, which we refer to as an exclusion surface. Hence, each point $\vec{N}$ can be
associated with a single parameter value, $\zeta(\vec{N})$. This gives us a rank ordering of the $\vec{N}$
defining whether a given $\vec{N}'$ contains "more," the "same," or "fewer" events than $\vec{N}''$.
The family $S(\zeta)$ therefore maps the multi-dimensional space $\{\vec{N}\}$ to a one-dimensional
space. This allows us to define a cumulative probability $C_S(\vec{n}|\lambda\vec{\epsilon} + \vec{b})$ by

$$ C_S(\vec{n}|\lambda\vec{\epsilon} + \vec{b}) = \sum_{\vec{N} \,|\, \zeta(\vec{N}) \le \zeta(\vec{n})} P(\vec{N}|\lambda\vec{\epsilon} + \vec{b}) , \qquad (15) $$

where the sum is taken over all $\vec{N}$ for which $\zeta(\vec{N}) \le \zeta(\vec{n})$; i.e., over all $\vec{N}$ that contain
as few events or fewer than $\vec{n}$.

Given a family of exclusion surfaces $S(\zeta)$ and a measured number of events $\vec{n}$, we
may use the cumulative probability $C_S$ to set an upper limit on $\lambda$ in the same way as
is done for the single-pipeline case. Specifically, for a measured number of events $\vec{n}$, the
upper limit $\lambda_\alpha$ at confidence level $\alpha$ is

$$ C_S(\vec{n}|\lambda_\alpha\vec{\epsilon} + \vec{b}) = 1 - \alpha . \qquad (16) $$

That is, the upper limit $\lambda_\alpha$ on the rate is that value for which in a fraction $\alpha$ of
an ensemble of experiments one would measure a number of events that falls in the
rejection region of $S(\zeta(\vec{n}))$. Put another way, the upper limit is the rate for which one
should measure "more" than $\vec{n}$ events (a value of $\zeta$ larger than $\zeta(\vec{n})$) in a fraction $\alpha$ of
an ensemble of experiments.

We will consider various simple choices of families $S(\zeta)$ and their interpretations
shortly. First, however, we prove that the algorithm (16) has coverage $\alpha$.



3.3. Coverage

We now prove that the upper limit formula (16) has a coverage of at least $\alpha$. The proof
follows that for the single-pipeline case in Section 2. Again, we note two properties of
$C_S(\vec{n}|\lambda\vec{\epsilon} + \vec{b})$:

$$ C_S(\vec{n}|\lambda\vec{\epsilon} + \vec{b}) > C_S(\vec{m}|\lambda\vec{\epsilon} + \vec{b}) \quad \mbox{for } \zeta(\vec{n}) > \zeta(\vec{m}) ; \qquad (17) $$

$$ \frac{dC_S(\vec{n}|\lambda\vec{\epsilon} + \vec{b})}{d\lambda} \le 0 . \qquad (18) $$

(See Appendix A for the proof of (18).) Let us suppose that the true value
of the rate is $\lambda_{\rm true}$. Let $\vec{m}$ be the vector with nonnegative integer components and with
the largest value of $\zeta(\vec{m})$ such that $C_S(\vec{m}|\lambda_{\rm true}\vec{\epsilon} + \vec{b}) < 1 - \alpha$. By definition of $\vec{m}$, in a
fraction $> \alpha$ of experiments the measured number of events $\vec{n}$ will have $\zeta(\vec{n}) > \zeta(\vec{m})$.
For these cases $C_S(\vec{n}|\lambda_{\rm true}\vec{\epsilon} + \vec{b}) \ge 1 - \alpha$. Applying the upper limit formula (16) and
noting (18), we see that in these cases the derived upper limit $\lambda_\alpha$ will be greater than
or equal to $\lambda_{\rm true}$. The coverage is thus established.

As stated before, our choice of exclusion surfaces is arbitrary except that the
outward normal to the contour must have non-negative components everywhere. This
restriction ensures that equation (18) is valid, which in turn is required to prove coverage.
As in the single-pipeline case, we may choose to ignore the background and use $\vec{b} = 0$
when computing upper limits. Since a non-zero background contribution will increase
the measured $\zeta$ over its zero-background value, from (16) - (18) it follows that the limit
will be higher than that computed accounting for the background, but coverage will be
maintained.



3.4. Choosing Exclusion Surfaces

We now turn to the question of how to select the family of exclusion surfaces to obtain
the strongest limits. For simplicity, we restrict ourselves henceforth to the simple case of
plane surfaces. In this case the family of exclusion surfaces is set by choosing the vector
$\vec{k}$ that is normal to the planes. The parameter for the family is then $\zeta(\vec{N}) = \vec{k} \cdot \vec{N}$ (the
magnitude of $\vec{k}$ is irrelevant). For a given observation $\vec{n}$ the upper limit $\lambda_\alpha$ is given by

$$ C_{\vec{k}}(\vec{n}|\lambda_\alpha\vec{\epsilon} + \vec{b}) = \sum_{\vec{N} \,|\, (\vec{n} - \vec{N}) \cdot \vec{k} \ge 0} P(\vec{N}|\lambda_\alpha\vec{\epsilon} + \vec{b}) = 1 - \alpha . \qquad (19) $$

Note that the sum is taken over all $\vec{N}$ satisfying the condition

$$ (\vec{n} - \vec{N}) \cdot \vec{k} \ge 0 . \qquad (20) $$
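
The plane-surface prescription lends itself to a short numerical sketch. The Python code below is a hedged illustration rather than the implementation used for any published analysis: the function names, the bisection solver, the rounding tolerances, and the example efficiencies are all assumptions of ours. Components with a zero weight are unconstrained, so their Poisson sums equal 1 and they are dropped from the enumeration.

```python
import math
from itertools import product

def poisson_pmf(N, mean):
    """Poisson probability of exactly N counts for the given mean."""
    return mean**N * math.exp(-mean) / math.factorial(N)

def cumulative(n, k, lam, eff, bkg):
    """Sum the factorized joint probability over all N with (n - N).k >= 0."""
    # Combinations with k_i = 0 are unconstrained: their Poisson sums equal 1.
    active = [i for i, k_i in enumerate(k) if k_i > 0]
    zeta_n = sum(k[i] * n[i] for i in active)
    # An active count N_i can never exceed zeta_n / k_i.
    ranges = [range(int(zeta_n / k[i] + 1e-9) + 1) for i in active]
    total = 0.0
    for N in product(*ranges):
        zeta_N = sum(k[i] * N_i for i, N_i in zip(active, N))
        if zeta_N <= zeta_n + 1e-12:        # small tolerance for rounding
            p = 1.0
            for i, N_i in zip(active, N):
                p *= poisson_pmf(N_i, lam * eff[i] + bkg[i])
            total += p
    return total

def upper_limit(n, k, eff, bkg=None, alpha=0.9, tol=1e-9):
    """Bisect for the rate at which the cumulative probability is 1 - alpha.

    Assumes k_i >= 0 and at least one active combination with eff_i > 0.
    Returns None if no solution with lam >= 0 exists.
    """
    if bkg is None:
        bkg = [0.0] * len(n)
    target = 1.0 - alpha
    if cumulative(n, k, 0.0, eff, bkg) < target:
        return None
    lo, hi = 0.0, 1.0
    while cumulative(n, k, hi, eff, bkg) > target:
        hi *= 2.0
        if hi > 1e9:
            raise ValueError("cumulative probability does not fall with lam")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cumulative(n, k, mid, eff, bkg) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eff = (0.2, 0.1, 0.5)                       # illustrative (eps_A, eps_B, eps_AB)
print(upper_limit((0, 0, 0), (1, 1, 1), eff))   # equal weights, no events
print(upper_limit((1, 0, 0), eff, eff))         # weights equal to efficiencies
```

Passing a different weight vector `k` changes only which outcomes enter the sum, so the same two functions cover every plane-surface choice.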

We now explore several simple choices of exclusion surfaces with ready physical 
interpretations: taking the logical AND or OR combinations of pipelines, and using 
only the most sensitive pipeline. We then propose a new choice of exclusion surfaces: 
$\vec{k} = \vec{\epsilon}$; i.e., we weight the measurements by the relative sensitivity of their pipelines.
We show that this efficiency-weighted approach has several advantages over the other 
choices discussed. In particular, it gives upper limits that are better than those from 
the other common choices for most outcomes of the experiment. 



3.4.1. OR combination One obvious way to orient the exclusion surfaces is to set the
normal vector $\vec{k} = (1, 1, \ldots, 1)$. This choice treats all distinct pipeline combinations
equally. For a given observation $\vec{n}$ the upper limit on $\lambda$ is then given by (16) with the
sum taken over all $\vec{N}$ satisfying the condition

$$ \sum_i N_i \le \sum_i n_i , \qquad (21) $$

or simply

$$ N_{\rm TOT} \le n_{\rm TOT} . \qquad (22) $$
That is, the upper limit depends only on the total number of events detected, regardless
of which pipelines or combinations of pipelines detected them. We see that this choice
of exclusion contour is equivalent to setting an upper limit based on a single pipeline
which is formed by taking the "OR" combination of all events detected by all pipelines
or combinations of pipelines.

For example, consider the case of two pipelines $A$ and $B$. Let us assume for
simplicity that the background is negligible ($b_A, b_B, b_{AB} \to 0$). If no events are detected,
the upper limit at confidence level $\alpha = 0.9$ is given by

$$ 0.1 = C_{\vec{k}}((0, 0, 0)|\lambda_{90\%}\vec{\epsilon}) = e^{-\epsilon_{\rm TOT}\lambda_{90\%}} , \qquad (23) $$

where $\epsilon_{\rm TOT} = \epsilon_A + \epsilon_B + \epsilon_{AB}$. This has the solution

$$ \lambda_{90\%} = \frac{2.30}{\epsilon_{\rm TOT}} . \qquad (24) $$

This has the same form as in the single-pipeline case, (5), with the replacement $\epsilon \to \epsilon_{\rm TOT}$.
Now consider the case of one event detected (it does not matter whether the lone
event is detected by $A$, by $B$, or by both). The upper limit is given by

$$ 0.1 = P((0, 0, 0)|\lambda_{90\%}\vec{\epsilon}) + P((1, 0, 0)|\lambda_{90\%}\vec{\epsilon}) + P((0, 1, 0)|\lambda_{90\%}\vec{\epsilon}) + P((0, 0, 1)|\lambda_{90\%}\vec{\epsilon}) $$
$$ = (1 + \epsilon_{\rm TOT}\lambda_{90\%})\, e^{-\epsilon_{\rm TOT}\lambda_{90\%}} , \qquad (25) $$

which has the solution

$$ \lambda_{90\%} = \frac{3.89}{\epsilon_{\rm TOT}} . \qquad (26) $$

This again has the same form as in the single-pipeline case, (6), with the replacement
$\epsilon \to \epsilon_{\rm TOT}$.

The OR combination has the advantage that it has the largest efficiency of any 
combination, since an event is counted if any of the pipelines detect it. This leads 
to strong upper limits when no events are detected. The disadvantage is that the 
background is also summed over all pipeline combinations, potentially leading to a high 
false alarm rate and poor limits if any of the pipeline samples are contaminated by 
background. 

3.4.2. AND combination A "conservative" choice for detecting events is to demand
that all pipelines observe an event for it to be counted as a possible signal. It is easy
to see that this is equivalent to choosing contours with normal vector $\vec{k} = (0, \ldots, 0, 1)$.
The upper limit for observation $\vec{n}$ is then given by (16) with the sum taken over all
$\vec{N}$ satisfying the condition

$$ N_q \le n_q , \qquad (27) $$

where $n_q$ is the number of events detected in coincidence by all pipelines. Because of
the factorization of the joint probability (14), the upper limit becomes

$$ 1 - \alpha = \sum_{N_1=0}^{\infty} \cdots \sum_{N_{q-1}=0}^{\infty} \sum_{N_q=0}^{n_q} \prod_{i=1}^{q} P(N_i|\epsilon_i\lambda_\alpha + b_i) $$
$$ = \left[ \sum_{N_1=0}^{\infty} P(N_1|\epsilon_1\lambda_\alpha + b_1) \right] \cdots \left[ \sum_{N_{q-1}=0}^{\infty} P(N_{q-1}|\epsilon_{q-1}\lambda_\alpha + b_{q-1}) \right] \sum_{N_q=0}^{n_q} P(N_q|\epsilon_q\lambda_\alpha + b_q) $$
$$ = \sum_{N_q=0}^{n_q} P(N_q|\epsilon_q\lambda_\alpha + b_q) . \qquad (28) $$

We see that the upper limit reduces to that for an effective single pipeline formed by
taking the AND combination of all pipelines. This has the same form as in the
single-pipeline case, (5), with the replacement $\epsilon \to \epsilon_q$.

Consider again the case of two pipelines $A$ and $B$ with low background. Suppose
we had decided a priori to compute an AND upper limit. If no events were detected by
any pipeline, then the 90% confidence upper limit is given by (5) with $\epsilon \to \epsilon_{AB}$:

$$ \lambda_{90\%} = \frac{2.30}{\epsilon_{AB}} . \qquad (29) $$

Since $\epsilon_{AB} < \epsilon_{\rm TOT}$, the AND combination gives a weaker limit for a given number of
measured events.

Now consider the case in which one event is detected. The limit now depends on
which pipeline combination detected the event. If only one of the pipelines detected
the event, then $n_q = 0$, and the 90% confidence upper limit is given by (29). If both
pipelines detected the event then $n_q = 1$ and

$$ \lambda_{90\%} = \frac{3.89}{\epsilon_{AB}} . \qquad (30) $$

These have the same form as in the single-pipeline case, (5), (6), with the replacement
$\epsilon \to \epsilon_{AB}$.

The AND combination has the advantage of being the combination least susceptible 
to background contamination, since an event is only counted if it is detected by all 
pipelines. For example, the AND combination is particularly robust if the pipelines 
have different responses to the background noise. The disadvantage is that the efficiency 
is also the lowest of any combination, for the same reason. In particular, the AND 
sensitivity is limited by the least sensitive pipeline. 
3.4.3. SINGLE combination Another simple choice for setting the upper limit is to
consider only the measurement by the single most sensitive pipeline, ignoring all of
the others. The most sensitive pipeline is the one with the largest detection efficiency
computed when ignoring the other pipelines; e.g., for the two-pipeline case it is the
larger of $\epsilon_A + \epsilon_{AB}$ (for $A$) or $\epsilon_B + \epsilon_{AB}$ (for $B$). The procedure for computing the upper
limit in this case is simply to apply (4). We note here that it is another special case
of the multiple-pipeline procedure. For example, for two pipelines where $A$ is the more
sensitive, the SINGLE limit is equivalent to choosing

$$ \vec{k} = (1, 0, 1) . \qquad (31) $$

If no events are detected by $A$, then the 90% confidence upper limit is given by (5) with
$\epsilon \to \epsilon_A + \epsilon_{AB}$:

$$ \lambda_{90\%} = \frac{2.30}{\epsilon_A + \epsilon_{AB}} . \qquad (32) $$

If one event is detected by $A$, the limit is

$$ \lambda_{90\%} = \frac{3.89}{\epsilon_A + \epsilon_{AB}} . \qquad (33) $$

The efficiency and background of the SINGLE combination are intermediate
between those of the OR and AND combinations. In general, $\epsilon_{\rm TOT} \ge \epsilon_A + \epsilon_{AB} \ge \epsilon_{AB}$,
so for a given number of measured events (for example, 0), OR will give the strongest
limit, AND the weakest, and SINGLE an intermediate value. On the other hand, the
background is highest for OR and lowest for AND, so there is a greater chance of having
$n > 0$ events in the OR combination. Unfortunately, for an unbiased analysis one must
choose the upper limit method before counting events, so it is difficult to make the best
choice between the AND, OR, and SINGLE options a priori.

3.4.4. Efficiency-weighted combination The AND, OR, and SINGLE options are just
three examples of how one may select the exclusion surfaces for the multiple-pipeline 
counting experiment. As just discussed, the relative strength of the upper limits one can 
achieve with these options depends on the number of events detected by each pipeline 
combination, which one does not know a priori in a blind analysis. 

An obvious drawback of the AND and OR examples is that the exclusion
surfaces are selected without regard to the known sensitivities $\vec{\epsilon}$ of the various pipeline
combinations. One expects that the strongest upper limits should involve use of this
information. As a trivial example, a pipeline combination with zero detection probability
($\epsilon_i = 0$) should be ignored when setting upper limits ($n_i$ should be ignored). The
SINGLE combination makes some limited use of the known sensitivities, but throws
away all of the information produced by the less-sensitive pipelines, even if they are
only slightly less sensitive than the best one.

A more natural way to incorporate the efficiency information in the upper limit 
procedure is to orient the exclusion surfaces according to the measured efficiencies. For 
plane exclusion surfaces, the simplest choice is 

$$ \vec{k} = \vec{\epsilon} . \qquad (34) $$

We term this choice the efficiency weighted combination, or EFF.

Heuristically, the efficiency weighted combination is an intelligent choice because
it places the largest emphasis on the measurements made by the most sensitive
combinations of pipelines. To see one of the desirable properties of this choice, consider a
repeated experiment. In an ensemble of experiments, the expected number of detections
by each pipeline combination $i$ is

$$ \langle \vec{n} \rangle = \lambda_{\rm true}\vec{\epsilon} + \vec{b} . \qquad (35) $$

Suppose the observed number of events is $\vec{n}'$ in one experiment, and $\vec{n}''$ in a second.
Which measurement should give the higher upper limit? If $(\vec{n}'' - \vec{n}') \cdot \vec{\epsilon} > 0$, then the
second measurement is consistent with a higher limit on $\lambda$. If $(\vec{n}'' - \vec{n}') \cdot \vec{\epsilon} = 0$, then the
two measurements imply the same upper limit on $\lambda$. The choice $\vec{k} = \vec{\epsilon}$ for the exclusion
contours enforces these requirements.
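
To make the efficiency-weighted rank ordering concrete, the following Python sketch lists the outcomes that rank no higher than a given observation. It is illustrative only: the function name `acceptance_set`, the truncation `n_max`, the rounding tolerance, and the example efficiencies are our own assumptions, and the enumeration is only complete when every weight is positive.

```python
from itertools import product

def acceptance_set(n, k, n_max=5):
    """All N (each component up to n_max) with N.k <= n.k, within rounding."""
    zeta_n = sum(k_i * n_i for k_i, n_i in zip(k, n))
    return [N for N in product(range(n_max + 1), repeat=len(n))
            if sum(k_i * N_i for k_i, N_i in zip(k, N)) <= zeta_n + 1e-12]

eff = (0.2, 0.1, 0.5)       # illustrative (eps_A, eps_B, eps_AB), eps_A = 2 eps_B

# One event seen only by the less sensitive pipeline B.
print(acceptance_set((0, 1, 0), eff))

# One event seen only by the more sensitive pipeline A.
print(acceptance_set((1, 0, 0), eff))

# Equal weights instead: any single event ranks the same.
print(acceptance_set((1, 0, 0), (1, 1, 1)))
```

With these example numbers an event in the low-efficiency combination admits few extra outcomes into the sum, which is why it raises the limit only slightly.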



3.5. Example: Rate Limit vs. Amplitude 

Consider once more the case of two pipelines $A$ and $B$. Let us suppose that the target
signals are characterized by an amplitude $\rho$, and that the detection efficiencies of $A$ and
$B$ separately, $E_A = \epsilon_A + \epsilon_{AB}$, $E_B = \epsilon_B + \epsilon_{AB}$, and their logical combinations $\epsilon_A$, $\epsilon_B$, $\epsilon_{AB}$,
are as shown in Figure 1. This scenario is typical of searches for gravitational-wave
bursts by LIGO and similar detectors [12, 13]. Our objective is to set an
upper limit on $\lambda$ as a function of the signal amplitude $\rho$.

Since both $E_A$ and $E_B \to 1$ at large $\rho$, $\epsilon_{AB} \to 1$ as well, while $\epsilon_A$ and $\epsilon_B$ are nonzero
for only a limited range of signal amplitudes. In this toy model, $A$ is sensitive to slightly
weaker signals than $B$, so $\epsilon_A > \epsilon_B$. However, since both $\epsilon_A$ and $\epsilon_B$ are nonzero, each
pipeline is able to detect some signals that the other pipeline misses. Therefore, one
expects that combining the measurements of the two pipelines should be able to provide
more information on the event rate than either pipeline alone.

Let us now compare the performance of four different choices of exclusion surfaces:
AND, OR, SINGLE, and EFF. For the moment, let us ignore any background when
computing the upper limits; i.e., we will use $\vec{b} = 0$. (We will compare limits including
background in the next section.)

Consider first the case where no events are detected. The upper limits from each
combination are shown in Figure 2. All combinations give $\lambda_{90\%} = 2.3$ at high amplitude,
where $\epsilon_{AB} \to 1$. In particular, the EFF upper limit is

$$ 0.1 = C_{\vec{\epsilon}}((0, 0, 0)|\lambda_{90\%}\vec{\epsilon}) = e^{-\epsilon_{\rm TOT}\lambda_{90\%}} , \qquad (36) $$

$$ \lambda_{90\%} = \frac{2.30}{\epsilon_{\rm TOT}} , \qquad (37) $$

identical to the OR limit. The EFF and OR combinations give the strongest limits for
weak signals because of their better efficiency (which is $\epsilon_{\rm TOT}$, the sum of the efficiencies
of all pipeline combinations).
Figure 1. Efficiencies for two pipelines $A$ and $B$. In our toy model, the signal is
characterized by an amplitude $\rho$. The dotted lines $E_A = \epsilon_A + \epsilon_{AB}$, $E_B = \epsilon_B + \epsilon_{AB}$
show the efficiencies of the two pipelines considered separately. The continuous lines
show the efficiencies of the logical combinations of the pipelines: $\epsilon_A$ ($A$ not $B$), $\epsilon_B$ ($B$
not $A$), and $\epsilon_{AB}$ ($A$ and $B$).


Now consider the case of one event detected by the weaker pipeline $B$: $\vec{n} = (0, 1, 0)$.
The upper limits are shown in Figure 3. The OR combination does poorly at high
amplitudes because of the detected event. The AND limit is much better at high
amplitudes because $A$ did not see the event, but still poor at low amplitudes because
$\epsilon_{AB} \to 0$. The SINGLE combination performs well, giving the same result as the $n = 0$
case, because it ignores the event counted by the less sensitive pipeline. The EFF upper
limit is computed by summing over

$$ \vec{N} \cdot \vec{\epsilon} \le \vec{n} \cdot \vec{\epsilon} = \epsilon_B . \qquad (38) $$

For signal amplitudes $\rho > 1$, $\epsilon_B$ is the smallest efficiency, so the allowed terms are
$\vec{N} \in \{(0, 0, 0), (0, 1, 0)\}$. The EFF limit is then given by

$$ 0.1 = P((0, 0, 0)|\lambda_{90\%}\vec{\epsilon}) + P((0, 1, 0)|\lambda_{90\%}\vec{\epsilon}) = (1 + \epsilon_B\lambda_{90\%})\, e^{-\epsilon_{\rm TOT}\lambda_{90\%}} . \qquad (39) $$

The extra $\epsilon_B\lambda$ term makes the EFF upper limit only slightly higher than the $2.3/\epsilon_{\rm TOT}$
value obtained in the $n = 0$ case, as can be seen from Figure 3. For $\rho < 1$, $\epsilon_B > \epsilon_{AB}$
and the upper limit includes additional $(\epsilon_{AB}\lambda)^{N_{AB}}$ terms. This causes the EFF limit to
increase, but again only slightly, as $\epsilon_{AB}$ is typically much smaller than $\epsilon_B$ at these
low amplitudes.

Figure 2. Upper limits as a function of signal amplitude when no events are detected. 
All methods give the asymptotic limit 2.3 for large amplitudes. The EFF and OR 
combinations give the strongest limits at low amplitude because they have better 
detection efficiency than the AND and SINGLE combinations. 



We see that the EFF combination effectively ignores the event counted by the
insensitive pipeline combination B, and gives a limit as good as or even slightly better
than that from the SINGLE combination.

Now turn to the case in which one event is detected by the more sensitive pipeline, A:
$\mathbf{n} = (1, 0, 0)$. The upper limits are shown in Figure 4. Again, the OR combination does
poorly at high amplitudes because of the detected event. The SINGLE combination does
even worse, since the event was found by the more sensitive pipeline, and the SINGLE
combination has lower efficiency than the OR combination. The AND combination
again performs well at high amplitudes and poorly at low amplitudes. The EFF upper
limit is computed by summing over

$$\mathbf{N}\cdot\boldsymbol{\epsilon} \le \mathbf{n}\cdot\boldsymbol{\epsilon} = \epsilon_A . \qquad (40)$$

The number of terms in the sum depends on the relative values of $\epsilon_A$, $\epsilon_B$, and $\epsilon_{AB}$.
In this simple example, for $\rho > 1.4$, $\epsilon_A = 2\epsilon_B < \epsilon_{AB}$ and the allowed terms are
$\mathbf{N} \in \{(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 2, 0)\}$. The EFF limit is given by

$$0.1 = \left(1 + \epsilon_A\lambda_{90\%} + \epsilon_B\lambda_{90\%} + \tfrac{1}{2}\epsilon_B^2\lambda_{90\%}^2\right) e^{-\epsilon_{\rm TOT}\lambda_{90\%}} . \qquad (41)$$

Since $\epsilon_A$ and $\epsilon_B$ are small at high amplitudes, the upper limit is again similar to the $\mathbf{n} = \mathbf{0}$
value of $2.3/\epsilon_{\rm TOT}$. For $\rho < 1.4$, $\epsilon_A > \epsilon_{AB}$ and the cumulative distribution $C_\zeta(\mathbf{n}|\lambda_\alpha, \boldsymbol{\epsilon})$
in (19) includes additional $(\epsilon_{AB}\lambda)^{N_{AB}}$ terms. This causes the EFF limit to increase,
becoming similar to that from the OR combination. In short, the EFF combination
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Figure 3. Upper limits as a function of signal amplitude when one event is detected 
by the less sensitive pipeline (B). The OR combination asymptotes to the single-event
value 3.9. The lone event is not counted by the AND and SINGLE combinations,
which give the $\mathbf{n} = \mathbf{0}$ limit 2.3. The EFF combination ignores the event at high
amplitudes (where $\epsilon_B \to 0$), while at lower amplitudes the EFF limit is very close to
the SINGLE limit as $\epsilon_B \ll \epsilon_A$. The thin dashed line is the best possible upper limit
from the counting experiment: that for zero observed events using the EFF or OR
combinations (see Figure 2).



gives the strongest limits at high amplitudes because pipeline B should have seen a real
event there and did not, and it gives the strongest limits at low amplitudes because it
has better efficiency than the AND combination.

Finally, consider the case of a single event detected by both pipelines: $\mathbf{n} = (0, 0, 1)$.
In this case all combinations give the asymptotic limit of 3.9 at large amplitudes, as
seen in Figure 5. The relative limits of the AND, OR, and SINGLE combinations are
the same as in the $\mathbf{n} = \mathbf{0}$ case. We see, however, that the EFF combination outperforms
all other combinations (including OR) in the low-amplitude limit. In fact, the EFF
limit reaches nearly the $\mathbf{n} = \mathbf{0}$ value at low signal amplitudes. This counter-intuitive
result has a simple explanation: at low amplitudes ($\rho < 1$), the probability $\epsilon_{AB}$ of a
real event being detected jointly by A and B is much smaller than the probabilities $\epsilon_A$,
$\epsilon_B$ of it being detected by either pipeline alone. The observation $n_{AB} > 0$ is therefore
inconsistent with the hypothesis of a low-amplitude signal. The efficiency-weighted
combination therefore ignores this measurement for the low-amplitude upper limits,
and the limit is dominated by the measurements $n_A = 0 = n_B$.

It is worth noting that the upper limits obtained from the efficiency-weighted 
procedure are neither monotonic nor continuous; this is most evident in Figure O The 
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Figure 4. Upper limits as a function of signal amplitude when one event is detected 
by the more sensitive pipeline (A). The OR and SINGLE combinations asymptote to
the single-event value 3.9. The lone event is not counted by the AND combination,
which gives the $\mathbf{n} = \mathbf{0}$ limit 2.3. The EFF combination ignores the event at high
amplitudes (where $\epsilon_A \to 0$), while at lower amplitudes the EFF limit is very close to
the OR limit for n = 1. The thin dashed line is the best possible upper limit from the
counting experiment: that for zero observed events using the EFF or OR combinations
(see Figure 2).



limits are not monotonic because the efficiencies $\epsilon_A$, $\epsilon_B$, $\epsilon_{AB}$ of the logical combinations
of pipelines are not monotonic, as shown in Figure 1. The origin of the discontinuities
is slightly more subtle; it arises from the need to sum over a discrete set of $\mathbf{N}$ in (19).
For the efficiency-weighted combination, the condition (20) depends on the assumed
signal amplitude through the efficiencies, $\mathbf{k} = \boldsymbol{\epsilon}(\rho)$. Therefore, the sum may include
different numbers of terms for different signal amplitudes. The discontinuities occur at
signal amplitudes where another term satisfies the condition to be included in the sum
in (19), $\mathbf{N}\cdot\boldsymbol{\epsilon} \le \mathbf{n}\cdot\boldsymbol{\epsilon}$. In turn, this happens when the ratio of efficiencies equals a rational
number. We stress that these discontinuities are a general feature of using efficiencies
to weight the pipeline combinations, and that they are not indicative of any problem
with the procedure. The upper limits at different $\rho$ values are limits on different signal
models, and therefore they need not be continuous or monotonic functions of $\rho$. Indeed,
this behaviour is advantageous, as seen in Figure 5, where the efficiency-weighted upper
limit is able to drop below the OR limit at low amplitudes.
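The origin of the discontinuities can be made concrete by enumerating the count vectors that satisfy $\mathbf{N}\cdot\boldsymbol{\epsilon} \le \mathbf{n}\cdot\boldsymbol{\epsilon}$ as the efficiency ratio changes. The sketch below is illustrative only: it uses two components with strictly positive efficiencies, exact rational arithmetic to avoid rounding at the boundary, and a hypothetical helper name `allowed_terms`.

```python
from fractions import Fraction
from itertools import product

def allowed_terms(n, eps):
    """Return all count vectors N with N.eps <= n.eps.

    Assumes every efficiency in eps is strictly positive and is given
    as a Fraction, so the boundary case N.eps == n.eps is decided exactly.
    """
    threshold = sum(e * c for e, c in zip(eps, n))
    # Largest count each component can reach on its own without exceeding the threshold.
    ranges = [range(int(threshold // e) + 1) for e in eps]
    return [N for N in product(*ranges)
            if sum(e * c for e, c in zip(eps, N)) <= threshold]

# One event in the more sensitive component; vary the ratio eps_A / eps_B.
n = (1, 0)
eps_a = Fraction(1, 2)
for ratio in (Fraction(3, 2), Fraction(19, 10), Fraction(2), Fraction(5, 2), Fraction(3)):
    eps = (eps_a, eps_a / ratio)
    terms = allowed_terms(n, eps)
    print(f"eps_A/eps_B = {float(ratio):.2f}: {len(terms)} terms {terms}")
```

The number of terms jumps each time the ratio crosses an integer, which is where the prefactor of the upper limit equation, and hence the limit itself, changes discontinuously.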

In each of the cases considered, efficiency weighting gives upper limits
approximately as strong as or stronger than any of the other choices. Without efficiency
weighting, the best remaining combination is different for the different cases: AND,
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Figure 5. Upper limits as a function of signal amplitude when a single event is 
detected by both pipelines (A and B). All combinations give the asymptotic limit of
3.9 at large amplitudes. The EFF combination ignores this event at low amplitudes
(where $\epsilon_{AB} \ll \epsilon_A, \epsilon_B$) and tends to the zero-event limit for $\rho < 1$. The thin dashed line
is the best possible upper limit from the counting experiment: that for zero observed
events using the EFF or OR combinations (see Figure 2).



OR, and SINGLE each perform best for at least one of the cases tested. While we must
choose the weighting before measuring $\mathbf{n}$ for the upper limit procedure to have the proper
coverage, there is no way to know a priori whether to choose AND, OR, or SINGLE. The
efficiency-weighted combination, however, gives optimal or near-optimal performance in
all cases.

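We can gain insight into the strong performance of the efficiency weighting choice
by examining the form of the upper limit equation (19):

$$1 - \alpha = \left(1 + \epsilon_1\lambda_\alpha + \tfrac{1}{2}\epsilon_1^2\lambda_\alpha^2 + \ldots + \epsilon_2\lambda_\alpha + \ldots\right) e^{-(\epsilon_1 + \ldots)\lambda_\alpha} . \qquad (42)$$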

The set of efficiencies appearing in the exponential is determined by the choice of
pipeline combination used for the upper limit. The set of terms appearing in the factor
in front of the exponential depends on the set of measured events $\mathbf{n}$ as well as the pipeline
combination chosen. As a rule, adding efficiency terms in the exponential decreases the
upper limit. Adding efficiency terms to the factor in front of the exponential increases
the upper limit. For the AND and SINGLE combinations, only some of the efficiencies
appear in the exponential. With the OR and EFF combinations, the efficiencies for all
pipeline combinations appear in the exponential, giving the maximum efficiency possible
($\epsilon_{\rm TOT}$). Between these two, the EFF combination will typically give fewer terms in the
prefactor when events are detected with the less sensitive pipeline combinations. This
will result in a lower limit than the OR combination. It may have more terms when
the most sensitive combination sees the event, thus giving a higher limit than the OR
combination in these cases. As seen in Figure 5, this loss in upper limit tends to be
small, since the extra terms are associated with low-efficiency pipeline combinations,
and appear with powers of those small $\epsilon_i$.
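As a concrete illustration of how the prefactor controls the limit, consider a hypothetical experiment with two disjoint counts of efficiency $\epsilon_1 = 3/5$ and $\epsilon_2 = 2/5$ (so the exponent is the same for OR and EFF), no background, $1 - \alpha = 0.1$, and a single event in the less sensitive count. These values are chosen for illustration. The OR ordering admits both single-event terms, whereas the EFF ordering admits only the one from the less sensitive count:

```latex
\begin{align}
\text{OR:}  \quad 0.1 &= (1 + \epsilon_1\lambda + \epsilon_2\lambda)\, e^{-(\epsilon_1+\epsilon_2)\lambda}
                       = (1 + \lambda)\, e^{-\lambda}
                       &&\Rightarrow\quad \lambda \approx 3.9 , \\
\text{EFF:} \quad 0.1 &= (1 + \epsilon_2\lambda)\, e^{-(\epsilon_1+\epsilon_2)\lambda}
                       = (1 + 0.4\,\lambda)\, e^{-\lambda}
                       &&\Rightarrow\quad \lambda \approx 3.1 .
\end{align}
```

Dropping the $\epsilon_1\lambda$ term from the prefactor, with the exponent unchanged, is what lowers the EFF limit relative to OR.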

3.6. Upper Limits with Background

We have seen that the EFF weighted combination tends to give stronger upper limits
than the AND, OR, and SINGLE weightings when we ignore the background. We now
demonstrate by example that this superior performance continues when we account
for the background as well. We do this by computing the expectation value of the
upper limit as a function of the true foreground rate $\lambda$ for two scenarios: one with low
background, and one with high background.

Let us consider once more the case of our two pipelines A and B. We will work
initially with a fixed set of efficiencies,

$$\boldsymbol{\epsilon} = (\epsilon_A, \epsilon_B, \epsilon_{AB}) = (0.345, 0.175, 0.480) . \qquad (43)$$

Let us assume the background to be

$$\mathbf{b} = (b_A, b_B, b_{AB}) = (1/3, 1/3, 1/3)\, b_{\rm TOT} . \qquad (44)$$

With this background, on average, pipelines A and B detect the same number of
background events, and half of the events detected by one are also detected by the
other. The total expected background is $b_{\rm TOT}$. We will consider the cases $b_{\rm TOT} = 0.1$
("low background") and $b_{\rm TOT} = 1$ ("high background").

A straightforward Monte Carlo analysis was used to estimate the upper limit in
an ensemble of experiments. Figure 6 shows the mean limits from the AND, OR,
SINGLE, and EFF combinations as a function of the true value of $\lambda \in [0, 1]$ for the
low background case. Figure 7 shows the mean limits for the high background case.
In both cases the EFF weighting gives stronger limits than any of the other weightings
for all values of $\lambda$ tested. The gap between the EFF upper limits and the next best
limits (from OR) is particularly large for the high-background case. These findings
support our conclusion that the EFF weighting "protects" the upper limit against
modest background contamination.

To get a sense of the robustness of the EFF weighting performance, we repeat the
Monte Carlo for a range of efficiencies. Specifically, we vary $\epsilon_A$ over [0,1], $\epsilon_B < \epsilon_A$,
and keep $\epsilon_{AB} = 1 - \epsilon_A - \epsilon_B$ so that $\epsilon_{\rm TOT} = 1$. We use $b_{\rm TOT} = 1$ ("high background")
and $\lambda = 0.5$. Figure 8 shows how the mean upper limit from the EFF weighting varies
with $\epsilon_A$, $\epsilon_B$. The mean limits range from 2.91 to 3.41, a variation of less than 20%.
By contrast, the mean limits from the other weightings (not shown) are always higher:
> 3.40 (SINGLE); > 3.25 (AND), and = 3.55 (OR). This indicates that the superior
performance of the EFF weighting is not reliant on the efficiencies taking particular
values.

Figure 6. Mean upper limit as a function of the true foreground rate $\lambda$ in an
ensemble of experiments with fixed low background. This two-pipeline experiment
has efficiency $(\epsilon_A, \epsilon_B, \epsilon_{AB}) = (0.345, 0.175, 0.480)$ and background $(b_A, b_B, b_{AB}) =
(1/30, 1/30, 1/30)$.

It can be noted from Figure 8 that the EFF limit does not reduce to the
SINGLE limit (> 3.40) when $\epsilon_B \to 0$. This is because the EFF combination becomes
$\mathbf{k} = (\epsilon_A, 0, \epsilon_{AB})$, whereas the SINGLE weighting is $\mathbf{k} = (1, 0, 1)$. So, the EFF weighting
maintains a distinction between events detected by A alone and those detected jointly
by A and B. The result is that the EFF limits are lower than or equal to the SINGLE
limits as $\epsilon_B \to 0$, with equality at $\boldsymbol{\epsilon} = (0.5, 0, 0.5)$.

Finally we note that the EFF weighting, since it is based on efficiency alone, is
most applicable to the case where the background is relatively small. We concentrate
on the case where the expected number of events due to background is of order 1 or less.
For much higher backgrounds the optimal weightings should also include information
on the backgrounds $b_A$, $b_{AB}$, ... of the various pipeline combinations.
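The kind of Monte Carlo estimate described in this section can be sketched as follows. This is a minimal illustration, not the code used to produce the figures: it implements only the EFF ordering, assumes all efficiencies are strictly positive, uses a simple Knuth-style Poisson sampler, and adopts the convention (an assumption) that the limit is zero when the background alone already makes the observed counts less probable than $1 - \alpha$. All function names are hypothetical.

```python
import math
import random
from itertools import product

def cumulative(lam, n, eps, bkg):
    """C(n | lam*eps + b) for the EFF ordering: sum of P(N) over N.eps <= n.eps."""
    threshold = sum(e * c for e, c in zip(eps, n))
    ranges = [range(int(threshold / e + 1e-9) + 1) for e in eps]
    total = 0.0
    for counts in product(*ranges):
        if sum(e * c for e, c in zip(eps, counts)) <= threshold + 1e-12:
            prob = 1.0
            for e, b, c in zip(eps, bkg, counts):
                mean = lam * e + b
                prob *= mean**c * math.exp(-mean) / math.factorial(c)
            total += prob
    return total

def upper_limit(n, eps, bkg, confidence=0.9):
    """Rate at which the cumulative probability falls to 1 - confidence (bisection)."""
    target = 1.0 - confidence
    if cumulative(0.0, n, eps, bkg) <= target:
        return 0.0  # assumed convention, see lead-in
    lo, hi = 0.0, 1.0
    while cumulative(hi, n, eps, bkg) > target:
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cumulative(mid, n, eps, bkg) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sample_poisson(mean, rng):
    """Knuth's multiplication method; adequate for the small means used here."""
    limit, count, prod = math.exp(-mean), 0, 1.0
    while True:
        prod *= rng.random()
        if prod <= limit:
            return count
        count += 1

def mean_upper_limit(lam_true, eps, bkg, trials=200, seed=1):
    """Average EFF upper limit over an ensemble of simulated experiments."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        n = tuple(sample_poisson(lam_true * e + b, rng) for e, b in zip(eps, bkg))
        total += upper_limit(n, eps, bkg)
    return total / trials

eps = (0.345, 0.175, 0.480)          # efficiencies from equation (43)
for b_tot in (0.1, 1.0):             # low and high background, equation (44)
    bkg = (b_tot / 3.0,) * 3
    print(f"b_TOT = {b_tot}: mean EFF limit = {mean_upper_limit(0.5, eps, bkg):.2f}")
```

The printed means depend on the random seed, the number of trials, and the zero-limit convention, so they should not be expected to match the published figures exactly.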

4. Multiple Data Sets 

The formalism we have developed for multiple algorithms analyzing a common data set
can be applied equally well to the analysis of multiple sets of data. For example, we may
have data from several observation periods, each characterized by the use of a different
set of instruments, or over which the sensitivity of the instruments changed, etc. In this
case, the analyses of the separate data epochs may be considered as separate pipelines
for purposes of setting an upper limit.

Figure 7. Mean upper limit as a function of the true foreground rate $\lambda$ in an ensemble
of experiments with fixed high background. This two-pipeline experiment has efficiency
$(\epsilon_A, \epsilon_B, \epsilon_{AB}) = (0.345, 0.175, 0.480)$ and background $(b_A, b_B, b_{AB}) = (1/3, 1/3, 1/3)$.

Figure 8. Mean upper limit as a function of efficiency $\boldsymbol{\epsilon} = (\epsilon_A, \epsilon_B, 1 - \epsilon_A - \epsilon_B)$ in
an ensemble of experiments with background $(b_A, b_B, b_{AB}) = (1/3, 1/3, 1/3)$ and true
event rate $\lambda = 0.5$. The largest limits occur when $\epsilon_A$, $\epsilon_B$, and $\epsilon_{AB} = 1 - \epsilon_A - \epsilon_B$ are
related by the ratio of small integers, as discussed in Section 3.5.

As a simple example, consider the case of a single algorithm used to analyse data
from two disjoint data sets A and B, with durations $T_A$, $T_B$. The sensitivity of the
experiment is characterized by the two numbers

$\epsilon_A$: The probability that any given foreground event will be detected during period A;
$\epsilon_B$: The probability that any given foreground event will be detected during period B.

The background is characterized by

$b_A$: The expected number of background events detected during period A;
$b_B$: The expected number of background events detected during period B.

The outcome of the experiment is the set of two numbers

$n_A$: The number of events detected during period A;
$n_B$: The number of events detected during period B.

Since any given event can be detected during period A or period B but not both, we
have $\epsilon_{AB} = 0$, $b_{AB} = 0$, $n_{AB} = 0$. We see immediately that this is a special case of the
two-pipeline analysis, where we treat the analysis of the separate data sets as separate
pipeline measurements. In fact, it is a particularly simple case, as we know $\epsilon_{AB} = 0$,
$b_{AB} = 0$, $n_{AB} = 0$ a priori.

Note that we define the efficiencies $\epsilon_A$, $\epsilon_B$ in terms of the probability of events
from anywhere in the entire observation period T being detected during periods A or
B. We are taking the union of the data sets to treat them as one large set. This is the
most convenient approach, since it matches precisely how the multiple-pipeline case was
developed. It saves us from including the separate observation times $T_A$, $T_B$ explicitly
in our upper limit calculations. Instead, they are included implicitly in the efficiencies.
For example, $\epsilon_A$ has a maximum possible value of $T_A/(T_A + T_B)$.

For concreteness, let us suppose we have two data sets of equal length, $T_A = T_B =
0.5T$. Suppose also that the instruments used were more sensitive during period A, such
that $\epsilon_A = 3/5$, $\epsilon_B = 2/5$, and $\epsilon_{\rm TOT} = \epsilon_A + \epsilon_B = 1$. Table 1 shows the upper limits
obtained ignoring the background. We compare the OR (combining event counts from
both periods), SINGLE (only counting events from the more sensitive period), and EFF
combinations for zero or one detected event. (The AND combination is not applicable
to this case, since $\epsilon_{AB} = 0$.)

In each case, the EFF combination gives the best upper limit. For no detected
events, the EFF and OR combinations give the limit 2.3 as before. The SINGLE limit
is a factor 5/3 higher, because it uses only 3/5 of the integrated sensitivity of the
experiment ($\epsilon_A = 3\epsilon_{\rm TOT}/5$). For one event detected in the less sensitive period B, the
EFF combination gives the best upper limit, even better than SINGLE. This may be
surprising, in that the SINGLE upper limit is computed for zero events. We see that
the extra sensitivity gained by including the B measurement in the EFF upper limit
more than offsets the loss in the limit due to having a detected event. Finally, for the
case of one event detected in the more sensitive period A, the EFF limit matches the
OR limit. Interestingly enough, the SINGLE combination performs worse than EFF in
all cases; for the given efficiencies, we always get a better limit by using all of the data.

Table 1. Comparison of upper limits obtained for various possible outcomes of a
counting experiment on two data sets A and B with $\epsilon_A = 3/5$, $\epsilon_B = 2/5$, and ignoring
background. The cases are: no events detected ($\mathbf{n} = (0, 0, 0)$); one event detected in B
($\mathbf{n} = (0, 1, 0)$); one event detected in A ($\mathbf{n} = (1, 0, 0)$).

                 upper limit
n            OR     SINGLE   EFF
(0,0,0)      2.3    3.8      2.3
(0,1,0)      3.9    3.8      3.1
(1,0,0)      3.9    6.5      3.9
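The entries of Table 1 can be reproduced with a short calculation. The sketch below is an illustration under stated assumptions: it drops the $AB$ component (since $\epsilon_{AB} = 0$ here), ignores background, represents each combination by a weight vector $\mathbf{k}$ (OR: $(1,1)$, SINGLE: $(1,0)$, EFF: $\boldsymbol{\epsilon}$), and marginalises over components with zero weight. The function names are hypothetical.

```python
import math
from itertools import product

def cumulative(lam, n, eps, k):
    """Sum of Poisson probabilities P(N | lam*eps) over all N with k.N <= k.n.

    Components with zero weight are marginalised over: their counts are
    unconstrained, so they contribute a factor of 1 to the sum.
    """
    threshold = sum(ki * ni for ki, ni in zip(k, n))
    active = [i for i, ki in enumerate(k) if ki > 0]
    ranges = [range(int(threshold / k[i] + 1e-9) + 1) for i in active]
    total = 0.0
    for counts in product(*ranges):
        if sum(k[i] * c for i, c in zip(active, counts)) <= threshold + 1e-12:
            prob = 1.0
            for i, c in zip(active, counts):
                mean = lam * eps[i]
                prob *= mean**c * math.exp(-mean) / math.factorial(c)
            total += prob
    return total

def upper_limit(n, eps, k, confidence=0.9):
    """Solve cumulative(lam) = 1 - confidence for lam by bisection."""
    target = 1.0 - confidence
    lo, hi = 0.0, 1.0
    while cumulative(hi, n, eps, k) > target:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if cumulative(mid, n, eps, k) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

eps = (3 / 5, 2 / 5)                                  # (eps_A, eps_B) as in Table 1
weights = {"OR": (1, 1), "SINGLE": (1, 0), "EFF": eps}
for n in ((0, 0), (0, 1), (1, 0)):                    # (n_A, n_B)
    row = {name: round(upper_limit(n, eps, k), 1) for name, k in weights.items()}
    print(n, row)
```

Rounded to one decimal place, the values obtained this way agree with Table 1.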

For a larger difference in efficiencies, the differences in upper limits are more
pronounced. Table 2 compares the upper limits for $\epsilon_A = 2/3$, $\epsilon_B = 1/3$, $\epsilon_{\rm TOT} = 1$.
The OR limits are unchanged. The SINGLE limits are better than those in Table 1
because the SINGLE combination now contains 2/3 of the integrated sensitivity of the
experiment ($\epsilon_A = 2\epsilon_{\rm TOT}/3$) instead of only 3/5. The changes in the EFF limits are
more complicated. For one event detected in the less sensitive period B, the EFF
combination still gives the best upper limit, slightly better than before, because the
weighting of B is less than in the previous case. For one event detected in A, the EFF
limit is between the OR limit and the SINGLE limit. The increase over the limit in
Table 1 is due to the fact that for $\epsilon_A = 2\epsilon_B$, the cumulative sum in (19) now includes
the terms $\mathbf{N} = \{(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 2, 0)\}$, whereas for $\epsilon_A = 1.5\epsilon_B$ it includes
only $\mathbf{N} = \{(0, 0, 0), (1, 0, 0), (0, 1, 0)\}$.

Note that the EFF limits are particularly robust against background events
contaminating the less-sensitive data sets. This allows the sub-optimal data to be used
to strengthen scientific results without fear of "spoiling" the upper limits. In particular,
note that the average of the upper limits for the single-event cases ($\mathbf{n} = (1, 0, 0)$ and
$(0, 1, 0)$) is best for the EFF combination in both Table 1 and Table 2. So, if the data
sets have equal background probability (assumed $\ll 1$), the EFF combination will on
average give the best upper limits for low true event rates.

Finally, since we have seen benefits from treating multiple data sets separately, one
might ask if we should always split up data sets. In particular, why not sub-divide all
data sets ad infinitum? The answer comes from noting that the benefits of the EFF
combination arise from exploiting differences in the efficiencies $\epsilon_i$. If the differences in
efficiency between two data sets are negligible, then there is no benefit to treating them
separately. For example, for two sets of data with identical efficiencies, the EFF and
OR combinations will always give identical limits: since $n_{AB} = 0$ always, choosing $\mathbf{k} = \boldsymbol{\epsilon}$
will always give the same results as $\mathbf{k} = (1, \ldots, 1)$. One therefore gets no benefit from
sub-dividing epochs of constant sensitivity.

Table 2. Comparison of upper limits obtained for various possible outcomes of a
counting experiment on two data sets A and B with $\epsilon_A = 2/3$, $\epsilon_B = 1/3$, and ignoring
background. The cases are: no events detected ($\mathbf{n} = (0, 0, 0)$); one event detected in B
($\mathbf{n} = (0, 1, 0)$); one event detected in A ($\mathbf{n} = (1, 0, 0)$).

                 upper limit
n            OR     SINGLE   EFF
(0,0,0)      2.3    3.5      2.3
(0,1,0)      3.9    3.5      3.0
(1,0,0)      3.9    5.8      4.3

5. Summary 

We have proposed a general technique for setting upper limits on Poisson processes 
from counting experiments involving multiple data sets and multiple event-counting 
algorithms (which we collectively refer to as multiple "pipelines"). This technique 
is an extension of the standard procedure for one-sided classical confidence intervals. 
There are two key features. First, we characterize the measurements by the logical 
combinations of pipelines - the number of events counted by A-and-B, by A-and-not-B, 
etc. Second, we select a rank-ordering of the space of possible measurements which is 
based on the relative detection efficiencies of these logical combinations. This efficiency 
weighting uses all of the counts from the experiment, but assigns more significance to 
those counts from pipeline combinations which are expected to detect more foreground 
events. We have seen that in typical cases for low background and low foreground 
event rate, the efficiency weighting tends to give stronger upper limits than selecting 
the AND or OR combination of pipelines, or selecting the single most sensitive pipeline 
only. In particular, the efficiency weighting procedure tends to be robust against modest 
background contamination of the event counts. This allows all of the observational 
results to contribute to the upper limit while reducing the chances that background 
contamination of some counts will weaken it. 

In this paper we have focused on computing upper limits; however, the method 
has wider applicability. The characterisation of the experiment in terms of logical 
combinations of pipelines and the subsequent rank ordering effectively reduce the 
space of measurements to one dimension. At this point we are free to apply other 
standard procedures for constructing one- or two-sided confidence intervals. It would be 
interesting, for example, to apply the Feldman-Cousins procedure [1] to produce unified 
upper limits and confidence intervals for our multiple-pipeline experiment; we leave this 
consideration to the future. 

As a final note, let us point out that the concept of a "pipeline" is quite general;
it is nothing more than a way of defining a count of events. We have seen that different
pipelines may consist of different algorithms applied to the same data, or the same
algorithm applied to different data sets. Distinct pipelines may also be defined in other
ways, such as by applying a single algorithm to a single data set and segregating the
resulting events into groups by some other attribute. For example, in gravitational-wave
burst searches the background is largely due to events detected at low frequencies (< 200
Hz). Dividing events into low-frequency (< 200 Hz) and high-frequency (> 200 Hz) sets
would produce limits on high-frequency gravitational waves that are not compromised by
the low-frequency background. LIGO matched-filtering searches for gravitational waves
from inspiralling binaries [19, 20] use a similar idea, dividing the space of templates
(signal parameters) into several regions. Background events that match templates in one
region then have minimal impact on the limits set in other regions of the template/signal
space. Multiple applications of the same algorithm with different counting thresholds
can also be treated as separate pipelines and handled by our method; this might be
appropriate when low- and high-amplitude events are produced by separate populations.
A multiple-threshold approach would have the benefit that events detected with low
(high) amplitude have minimal impact on the rate limits set on the high (low) amplitude
population. Our method even naturally handles the case of a "veto" analysis, in which
one pipeline (B) processes data in such a way as to be deliberately insensitive to signals,
but sensitive to background noise: $\epsilon_B, \epsilon_{AB} \approx 0$ but $b_B, b_{AB} > 0$. The EFF weighting
then automatically ignores (vetoes) events detected by A that are also detected by the
veto pipeline B.

Appendix A. Derivative of $C_\zeta(\mathbf{n}|\boldsymbol{\epsilon},\lambda)$

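In this appendix we prove equation (18),

$$\frac{d C_\zeta(\mathbf{n}|\lambda\boldsymbol{\epsilon} + \mathbf{b})}{d\lambda} < 0 , \qquad (A.1)$$

where $\mathbf{n}$, $\boldsymbol{\epsilon}$, $\mathbf{b}$, and the family $S(\zeta)$ are held fixed.

First, we recall the definition (15) of the cumulative probability $C_\zeta$,

$$\begin{aligned}
C_\zeta(\mathbf{n}|\lambda\boldsymbol{\epsilon} + \mathbf{b})
 &= \sum_{\mathbf{N}|\zeta(\mathbf{N})\le\zeta(\mathbf{n})} P(\mathbf{N}|\lambda\boldsymbol{\epsilon} + \mathbf{b}) \\
 &= \sum_{\mathbf{N}|\zeta(\mathbf{N})\le\zeta(\mathbf{n})} \prod_{i=1}^{q} \frac{(\epsilon_i\lambda + b_i)^{N_i}}{N_i!}\, e^{-(\epsilon_i\lambda + b_i)} \\
 &= \sum_{\mathbf{N}|\zeta(\mathbf{N})\le\zeta(\mathbf{n})} \left[\prod_{i=1}^{q} \frac{(\epsilon_i\lambda + b_i)^{N_i}}{N_i!}\right]
    \exp\left[-(\epsilon_1 + \ldots + \epsilon_q)\lambda - (b_1 + \ldots + b_q)\right] .
\end{aligned} \qquad (A.2)$$

Taking the derivative with respect to $\lambda$ yields

$$\begin{aligned}
\frac{d C_\zeta(\mathbf{n}|\lambda\boldsymbol{\epsilon} + \mathbf{b})}{d\lambda}
 = \sum_{\mathbf{N}|\zeta(\mathbf{N})\le\zeta(\mathbf{n})} \Bigg\{ \Bigg[
   & \frac{N_1\epsilon_1(\epsilon_1\lambda + b_1)^{N_1-1}}{N_1!} \times \ldots \times \frac{(\epsilon_q\lambda + b_q)^{N_q}}{N_q!} \\
   & + \ldots + \frac{(\epsilon_1\lambda + b_1)^{N_1}}{N_1!} \times \ldots \times \frac{N_q\epsilon_q(\epsilon_q\lambda + b_q)^{N_q-1}}{N_q!} \\
   & - (\epsilon_1 + \ldots + \epsilon_q)\, \frac{(\epsilon_1\lambda + b_1)^{N_1}}{N_1!} \times \ldots \times \frac{(\epsilon_q\lambda + b_q)^{N_q}}{N_q!} \Bigg] \\
   & \times \exp\left[-(\epsilon_1 + \ldots + \epsilon_q)\lambda - (b_1 + \ldots + b_q)\right] \Bigg\} .
\end{aligned} \qquad (A.3)$$

Consider the contribution of the term $\mathbf{N}' = (N'_1, \ldots, N'_q)$ to the sum. We see that the
positive terms arise from taking the derivative of the factors $(\epsilon_i\lambda + b_i)^{N'_i}$. Each such positive term is
exactly cancelled by a negative term coming from the derivative of the exponential from
the term $\mathbf{N}'' = (N'_1, \ldots, N'_i - 1, \ldots, N'_q)$. $\mathbf{N}''$ will always be included in the sum if $\mathbf{N}'$
is included because of the requirement that the normal to the surfaces $S(\zeta)$ must have
only non-negative components. Therefore, all positive terms in (A.3) are cancelled and
the derivative must be negative.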
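The monotonic decrease of the cumulative probability can also be checked numerically. The sketch below is a sanity check rather than a proof: it evaluates the cumulative probability on a grid of rates for one illustrative choice of counts, efficiencies, and backgrounds, using the efficiency-weighted ordering $\zeta(\mathbf{N}) = \mathbf{N}\cdot\boldsymbol{\epsilon}$ and assuming strictly positive efficiencies. The function name and the chosen inputs are assumptions made for the example.

```python
import math
from itertools import product

def cumulative(lam, n, eps, bkg):
    """Sum of independent-Poisson probabilities over all N with N.eps <= n.eps."""
    threshold = sum(e * c for e, c in zip(eps, n))
    ranges = [range(int(threshold / e + 1e-9) + 1) for e in eps]
    total = 0.0
    for counts in product(*ranges):
        if sum(e * c for e, c in zip(eps, counts)) <= threshold + 1e-12:
            prob = 1.0
            for e, b, c in zip(eps, bkg, counts):
                mean = lam * e + b
                prob *= mean**c * math.exp(-mean) / math.factorial(c)
            total += prob
    return total

# Illustrative inputs (assumed): one event in A, one joint event, high background.
n = (1, 0, 1)
eps = (0.345, 0.175, 0.480)
bkg = (1 / 3, 1 / 3, 1 / 3)

rates = [0.1 * step for step in range(0, 101)]        # lambda from 0 to 10
values = [cumulative(lam, n, eps, bkg) for lam in rates]
decreasing = all(later < earlier for earlier, later in zip(values, values[1:]))
print("cumulative probability strictly decreasing on the grid:", decreasing)
```

The set of terms is closed under lowering any single count, which is the property the cancellation argument relies on.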
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